(57) A method and apparatus for encoding digital information
to be recorded on a magnetic medium is disclosed.
The invention provides for receiving a sequence
of $(2^m n + d)$ user bits, mapping the sequence of user
bits to $2^m$ dc-free codewords, and recording the $2^m$ dc-free
codewords on a magnetic medium. A modulation
coder, which includes a memory containing multiple 
non-intersecting subconstellations of dc-free code- 
words, performs the mapping in a non-equiprobable 
manner such that a particular codeword from a larger 
subconstellation is more likely to be used than a particular
codeword from a smaller subconstellation. Less desirable
codewords, such as those containing relatively
long strings of bits having the same value, are assigned 
to the smaller subconstellations, thereby lessening the 
likelihood of loss of timing and gain parameters in the 
system, as well as maximizing the transmission rate and 
efficient use of the set of possible dc-free sequences of 
a given length. 
Description

CROSS-REFERENCE TO RELATED APPLICATIONS 

This application is related to United States Patent
Application Serial No. 08/515,445, Attorney Docket No.
E. Soljanin 1, entitled "Method and Apparatus for Generating
DC-Free Sequences," which was filed on August 15, 1995
and which is commonly assigned to the assignee of the
present invention.

FIELD OF THE INVENTION 

The present invention relates generally to the field 
of coding for digital systems, and, in particular, improved 
methods and apparatus for generating high rate codes 
that are dc-free and suitable for recording information 
on a magnetic medium. 

BACKGROUND OF THE INVENTION 

Information, such as signals representing voice, da- 
ta, video or text, must typically be processed before the 
information can be transmitted over a communications 
channel or recorded on a medium. First, the information, 
if not already in digital form, is digitized, for example by 
an analog-to-digital converter. Next, the digital informa- 
tion may be "compressed" to represent the information 
by a fewer number of bits. Any savings due to compres- 
sion are, however, partially offset by processing the 
compressed information using error correcting codes. 
Error correcting codes introduce additional bits to a sig- 
nal to form an encoded signal. The additional bits im- 
prove the ability of a system to recover the signal when 
the encoded signal has been corrupted by noise intro- 
duced by a communications channel or by a recording 
medium. 

A further type of coding used in transmission and 
recording systems is modulation coding. As with error 
correcting codes, modulation coding can improve a sys- 
tem's immunity to noise. Modulation codes also can ad- 
vantageously be used to regulate timing and gain pa- 
rameters in recording and communications systems. 

Consider, for example, a system which reads information
stored on a magnetic medium. In non-return-to-zero-inverse
(NRZI) recording, a binary "1"
is recorded on a portion of the magnetic medium by 
causing a change in the magnetization or magnetic flux 
of that portion of the medium. A binary "0" is recorded 
by causing no change in magnetization. The bits are 
read by detecting a sequence of changes in a voltage 
signal caused by changes in the magnetization of por- 
tions of the medium. The voltage signal, however, may 
be corrupted by noise in the recording system. The voltage
is typically a pulse each time a "1" is detected and
just noise each time a "0" is detected. The position of
the pulses is used to set timing parameters in the system,
and the height of the pulses is used to set gain parameters
in the system. If, however, a long string of zeros
is read, there is no voltage output other than noise,
and hence no timing or gain information, thereby leading
to a loss of, or drift in, timing and gain parameters in the
system.

Modulation coding thus may be used to ensure that 
the recording or transmission of a long string of binary 
zeros is avoided. Modulation coding may be implement- 
ed, for example, by dividing digital information that is to 
be recorded into sets of bits, called information words. 
Each information word is then used to select a codeword 
in a codebook. The codewords in the codebook are of 
length N bits where the codeword bits define a channel 
sequence, in other words, a sequence of symbols to be 
sent over a channel. For example, a binary "1" in a codeword
may represent the symbol "-1" or negative magnetic
flux, and a binary "0" in a codeword may represent
the symbol "+1" or positive magnetic flux. If the codewords
in the codebook do not contain a long string of
zeros, then the selected codewords recorded on the me- 
dium will likewise not contain a long string of zeros, 
thereby obviating the timing and gain control problem. 

Additionally, it is often desirable to use channel se- 
quences that have a spectral null at zero (dc) frequency 
by which it is meant that the power spectral density func- 
tion of the channel sequence at dc equals zero. Such 
sequences are said to be dc-free. One way to assure a 
dc-free sequence is to design a system in which the 
block digital sum, or the arithmetic sum, of symbols in a 
codeword transmitted over a channel is zero. However, 
efficient or high-rate modulation codes that can prevent 
long strings of zeros from occurring without adding an 
excessive number of redundant bits to the information 
to be recorded, and that are dc-free typically require 
both codewords and codebooks of larger sizes as dis- 
cussed further below. 

It is known that the power spectral density function
of a channel sequence x, where $x = \ldots x_{-1}, x_0, x_1, \ldots$,
vanishes at zero frequency if and only if its running-digital-sum
(RDS), defined as

$$\mathrm{RDS}_i(x) = \sum_{j=-\infty}^{i} x_j$$

is bounded. It is also known, for example, how to translate
sequences of symbols from the symbol alphabet of
the error-correcting code symbols into channel sequences
with bounded RDS's by means of dc-free modulation
codes, which may be finite-state codes or block
codes. Block codes, for example, take blocks of M symbols,
called information words, and map them into
blocks of N channel symbols or sequences called codewords.
Several factors favor the use of block codes. One
such factor is limited error propagation since the symbols
used to encode one block are not used in encoding
any other block and thus errors in encoding are typically
confined to a particular block. Another factor is ease of
implementation. One way to organize the mapping of
information words to codewords is to form a codebook
or look-up table of $2^M$ codewords and use an M-bit input
word to specify or address an N-bit codeword in the
codebook. The ratio M/N defines the rate R of the modulation
code.

To ensure that an arbitrary sequence of codewords
has a bounded RDS, each codeword $w = w_1 w_2 \ldots w_N$ is
required to have a block digital sum (BDS), defined as

$$\mathrm{BDS}(w) = \sum_{j=1}^{N} w_j,$$

equal to zero. Codewords of bipolar symbols, for example,
+1 and -1, and having a BDS equal to zero, are possible only if
the codeword length N is even and if half the symbols
are -1 and half the symbols are +1. The number of such
codewords is then equal to

$$\binom{N}{N/2},$$

where

$$\binom{N}{N/2} = \frac{N (N-1) \cdots (N - N/2 + 1)}{(N/2)(N/2 - 1) \cdots (1)}.$$

However, at most $2^M$ codewords having a BDS equal to
zero can be used to form a codebook for an M/N rate
code, where

$$M = \mathrm{floor}\left[\log_2 \binom{N}{N/2}\right]$$

and where the function floor[x] returns the largest integer
less than or equal to x. The code rate R = M/N indicates
that for every M information bits, N channel bits
are generated, with N > M.
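
The relationship between a zero BDS and a bounded RDS can be illustrated with a short sketch. The function names and the sample stream below are illustrative assumptions, not part of the disclosure; the bit-to-symbol mapping follows the convention given above, in which a binary "1" represents the symbol -1 and a binary "0" represents the symbol +1.

```python
from itertools import accumulate

def to_symbols(bits):
    """Map a bit string to bipolar symbols: "1" -> -1, "0" -> +1."""
    return [-1 if b == "1" else 1 for b in bits]

def block_digital_sum(codeword):
    """Arithmetic sum of the bipolar symbols of one codeword."""
    return sum(to_symbols(codeword))

def running_digital_sum(sequence):
    """Running sum of the bipolar symbols, one value per position."""
    return list(accumulate(to_symbols(sequence)))

# Four hypothetical 4-bit codewords, each with BDS equal to zero.
codewords = ["0011", "0101", "1100", "1001"]
stream = "".join(codewords)
rds = running_digital_sum(stream)

print([block_digital_sum(w) for w in codewords])  # every entry is 0
print(rds)
```

Because every codeword in the stream sums to zero, the running sum returns to zero at each codeword boundary and cannot exceed N/2 in magnitude inside a codeword, which is the boundedness property the text relies on.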

The above explanation is rendered more clear by
use of a specific example. Consider a sequence having
a block length N equal to 4. There are 16 possible sequences,
6 of which are dc-free. By using a block length
of four bits, however, the value of M equals 2, and two
of the dc-free codewords will not be used in the codebook.
In some cases, the requirement that

$$M = \mathrm{floor}\left[\log_2 \binom{N}{N/2}\right]$$

causes a substantial number of extra dc-free sequences
not to be used. For example, if N=8, the number of dc-free
sequences is 70, but the codebook is of size 64 and
thus 6 dc-free sequences are not used. Similarly, if
N=10, there are 252 possible dc-free sequences. The
codebook, however, is of size 128. Thus, 124 sequences
are discarded, thereby lowering the code rate from
approximately 0.8 to 0.7.
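
The figures quoted in this example can be checked numerically. The following is a minimal sketch; the function name `block_code_parameters` is a hypothetical helper introduced only for illustration.

```python
from math import comb

def block_code_parameters(n):
    """Return (dc-free count, M, codebook size, rate M/N) for block length n.

    The count is the central binomial coefficient C(n, n/2), and
    M = floor(log2(count)), computed exactly via the integer bit length.
    """
    count = comb(n, n // 2)
    m = count.bit_length() - 1
    return count, m, 2 ** m, m / n

for n in (4, 8, 10, 14, 16):
    count, m, size, rate = block_code_parameters(n)
    print(f"N={n}: {count} dc-free sequences, M={m}, "
          f"codebook size {size}, {count - size} unused, rate {rate:.3f}")
```

For N=10 the sketch reports 252 dc-free sequences, a codebook of 128 and 124 unused sequences, in agreement with the text.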

Furthermore, in magnetic recording applications, it
is desirable that modulation codes have rates higher
than 3/4 so that more information can be written on the
recording medium. Codes having a relatively long block
length are required for rates above 3/4. Also, large codebooks
are required where the codewords in the codebooks
are dc-free. For example, a code of rate 11/14
requires a block length of 14 and a codebook size of
2048, and a code of rate 13/16 requires a block length
of 16 and a codebook of size 8192. Such large codebooks,
however, typically require the implementation of
more complex circuitry and often require large power
consumption and large area on integrated circuits relative
to other elements in the transmission or recording
system. Also, the larger the codebook, the more time it
takes to access codewords in the codebook. Although
some techniques have been proposed to reduce the
size of the codebooks, these techniques add additional
complexity and do not substantially reduce the size of
the codebooks. Thus, there is a need for a method and
apparatus for generating high rate codes that are dc-free
and suitable for recording information on a magnetic
medium.

SUMMARY OF THE INVENTION 

The present invention addresses the aforementioned
problems by providing a novel method and apparatus
for encoding digital information to be recorded on
a magnetic medium. The method preferably includes
the steps of receiving a sequence of $(2^m n + d)$ user bits
and mapping the sequence of user bits to $2^m$ dc-free
codewords. In particular, the method preferably includes
selecting $2^m$ dc-free codewords from among a plurality
of non-intersecting subconstellations of dc-free codewords
of a particular length, wherein the step of selecting
comprises the step of identifying from which of the
plurality of subconstellations each of the $2^m$ dc-free
codewords will be selected by using a portion of the sequence
of user bits in a predetermined order and the
step of specifying the $2^m$ dc-free codewords based upon
the remaining user bits in the sequence. The selected
$2^m$ dc-free codewords may then be recorded on a magnetic
medium.

A modulation coder, which preferably includes a
codebook comprising the plurality of non-intersecting
subconstellations of dc-free codewords, performs the
mapping in a non-equiprobable manner such that a particular
codeword from a larger subconstellation is more
likely to be used than a particular codeword from a
smaller subconstellation. In particular, codewords with more
bit transitions or more frequent bit transitions preferably 
are assigned to subconstellations different from code- 
words with fewer or less frequent bit transitions. Specif- 
ically, codewords with a relatively high number of bit 
transitions are preferably assigned to the larger subcon- 
stellations, whereas codewords with a relatively low 
number of bit transitions are preferably assigned to the 
smaller subconstellations, thereby lessening the loss of 
timing and gain parameters in the magnetic recording 
system, as well as improving the transmission rate and 
efficient use of the set of possible dc-free sequences of 
a given length. 

The present invention also provides a method and 
apparatus for decoding information stored on a magnet- 
ic medium. A more complete understanding of the 
present invention, as well as other features and advan- 
tages of the present invention, may be obtained with ref- 
erence to the following detailed description and accom- 
panying drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 is a block diagram of an exemplary system 
in which the invention may be practiced. 

FIG. 2 is a flow chart showing the steps for design- 
ing a modulation coder with a codebook according to 
the principles of the present invention. 

FIG. 2A is a flow chart showing the steps for assigning
codewords to subconstellations in the codebook in
accordance with the present invention. 

FIGS. 3A-3C illustrate the creation of an exemplary
constellation tree according to the principles of the 
present invention. 

FIG. 4 is a table showing the relative sizes and the 
actual probabilities of utilization which result from using 
the encoding algorithm of the present invention with re- 
spect to an exemplary set of four subconstellations. 

FIG. 5 is a flow chart showing the steps used in 
mapping user bits to codewords according to the 
present invention. 

FIG. 5A is a flow chart showing the steps of the encoding
algorithm in accordance with the present invention.

FIG. 6 and FIGS. 6A-6D illustrate an exemplary
sequence of user bits and the application of the encod- 
ing algorithm of the present invention using the exem- 
plary sequence of user bits. 

FIG. 7 is a flow chart showing the steps of recovering
the digital information stored on a magnetic medium
in accordance with the present invention. 



DETAILED DESCRIPTION OF THE INVENTION 

FIG. 1 is a block diagram of an exemplary system
in which the invention may suitably be practiced. The
system of FIG. 1 is particularly useful for recording digital
information on and reading digital information from
a magnetic medium such as those employed in digital
audio tapes or disk drives.

The information to be recorded preferably is first
"compressed" using a Lempel-Ziv compressor 110 so
as to reduce the amount of information that must be recorded
on the medium, thereby saving time and money.
Next, the compressed information preferably is sent as
an input to an encoder 120 which encodes the compressed
information using Reed-Solomon codes. The
purpose of Reed-Solomon encoding is to adjoin extra
symbols to the compressed information so that noise introduced
in the reading process will not cause errors
when the information is received. Lempel-Ziv and Reed-Solomon
encoding are described in greater detail in
Timothy C. Bell et al., Text Compression, Prentice-Hall,
Englewood Cliffs, NJ, 1990 and S. Lin and D.J. Costello,
Error Control Coding, Prentice-Hall, Englewood Cliffs,
NJ, 1983, respectively, which are incorporated herein
by reference. Other coding techniques which provide
the same effect may suitably be employed.

The output of the encoder 120 is a series of symbols
where each symbol is represented by a set of bits. The
symbols are sent as an input to a modulation coder 130
which is designed to map each of the input symbols into
a dc-free sequence or codeword having a predetermined
number of bits. The modulation coder 130 may
be implemented using a processor designed or programmed
to perform the functions explained below. In
particular, the modulation coder 130 may conveniently
be implemented, for example, by a microprocessor programmed
with appropriate firmware. For example, an
80C51 microprocessor manufactured by the Philips
Semiconductors company may suitably be used to implement
the coder 130. Alternatively, the modulation
coder 130 may be suitably implemented by a general
purpose processor or in a specially designed semiconductor
chip or other dedicated hardware. In any event,
the dc-free sequences or codewords are then transmitted
from the modulation coder 130 to a write head 135
and recorded on a magnetic medium 140, such as a digital
audio tape or a disk drive.

Signals representing the information recorded on
the magnetic medium 140 may then be read by a read
head 145 and preferably sent to an equalizer 150 which
controls intersymbol interference. Output signals from
the equalizer are then sent to a modulation decoder 160,
a Reed-Solomon decoder 170 and a Lempel-Ziv decompressor
180, respectively, so as to recover the information
recorded on the magnetic medium 140.

FIG. 2 is a flow chart showing the steps for programming
or designing the modulation coder 130 with a codebook
for use in mapping a sequence of bits using a specified
rate. First, as indicated by step 200, a constellation
tree is created corresponding to the desired average
rate, where the average rate p has the form $p = (n + d/2^m)$
bits per codeword, where $2^m$ is the dimensionality
of the code, n is a positive integer, and d is a positive
integer less than $2^m$. A generalized cross constellation
is a set of non-intersecting sub-constellations of channel 
sequences or codewords from which N-dimensional dc- 
free channel sequences are drawn. Each subconstella- 
tion is a subset of the entire constellation of channel se- 
quences or codewords. The relative sizes of the sub- 
constellations determine both the rate of the encoder 
and the dimensionality of the code. A specific code is 
thus uniquely defined. 

The constellation tree comprises a root node and at 
least one additional level of nodes, including leaf nodes 
at the lowest level of the constellation tree. Each node, 
except the leaf nodes, has first and second branches 
such that each subsequent level of nodes has twice as 
many nodes as the previous level. This is illustrated, for 
example, in FIG. 3A. The root node represents the entire 
constellation $\Omega$, whereas the subconstellations $\Omega_{00}$,
$\Omega_{01}$, $\Omega_{10}$, $\Omega_{11}$ are represented in FIG. 3A by the leaf
nodes at the lowest level of the constellation tree.

In order to aid in understanding the present invention,
it will be helpful to use a particular example. Consider,
for example, a rate of (7 + 3/4). Using this rate, n
= 7, d = 3, and m = 2. Next, consider the binary expansion
of

$$d = \sum_{i=0}^{m-1} d_i 2^i.$$

The number of levels in the constellation tree, excluding
the root node, is w, where w equals the number of indices
i for which $d_i$ is not equal to zero. Using the example
indicated above, $d = 3 = 2^0 + 2^1$. Thus, in the example,
w equals 2, as shown in FIG. 3A.

Next, the branches of the constellation tree are labelled
with a weight as explained below. Let p(0) < p(1)
< ... < p(w-1) be the members of the set P which represents
the ordering of indices i for which $d_i$ is not equal
to zero. Again, using the above example, P = {0, 1}, with
p(0) = 0, and p(1) = 1. As indicated by step 204, the first
branch of each node, that is, each of the left-going branches
of the constellation tree, is labelled with a weight of
zero. Next, as indicated by step 206, the second branch
of each node, that is, each of the right-going branches, is
labelled with the weight (m - p(j)), where j is the level of
the tree and p(j) is defined as above. FIG. 3B illustrates
the resulting constellation tree and relative weights of
the branches for the above example.

Once the constellation tree is created, the relative
weight r(i) of each subconstellation is determined, as indicated
in step 210. To determine the relative weight of
a particular subconstellation represented by one of the
leaf nodes, the sum of the weights along the path commencing
from the root node and terminating with the
particular leaf node is calculated. The relative weights
corresponding to the above example are indicated in
FIG. 3C.

The size or number of entries of each subconstellation
may be determined in step 220 from the relative
weights according to the formula $2^n \times 2^{-r(i)}$. Thus, as
indicated by FIG. 3C, the subconstellation $\Omega_{00}$ in the
above example has a constellation size of 128, the subconstellation
$\Omega_{01}$ has a size of 64, the subconstellation
$\Omega_{10}$ has a size of 32, and the subconstellation $\Omega_{11}$ has
a size of 16. The total constellation $\Omega$, in this example,
therefore, contains 240 codewords. It will be noted that
the sizes of the subconstellations are different from one
another. In addition, the number of bits or length of the
codewords is preferably chosen to be the minimum
length N, where

$$\binom{N}{N/2}$$

is greater than or equal to the total number of codewords
in the constellation $\Omega$. Thus, in the above example, each
of the 240 dc-free codewords preferably has a length of
ten bits.
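
The tree construction of steps 200 through 220 can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, leaf labels are written as bit strings with "0" for a left branch and "1" for a right branch, and the right branch at level j is given the weight (m - p(j)) as described above.

```python
from math import comb

def subconstellation_sizes(n, d, m):
    """Return {leaf label: size} for the rate n + d / 2**m."""
    # P holds the indices i of the non-zero bits d_i of d, in ascending order.
    P = [i for i in range(m) if (d >> i) & 1]
    leaves = {"": 0}  # label -> relative weight r accumulated along the path
    for p in P:
        next_level = {}
        for label, weight in leaves.items():
            next_level[label + "0"] = weight            # left branch, weight 0
            next_level[label + "1"] = weight + (m - p)  # right branch
        leaves = next_level
    return {label: 2 ** (n - r) for label, r in leaves.items()}

def minimum_codeword_length(total):
    """Smallest even length N with C(N, N/2) >= total."""
    length = 2
    while comb(length, length // 2) < total:
        length += 2
    return length

sizes = subconstellation_sizes(n=7, d=3, m=2)
total = sum(sizes.values())
print(sizes, total, minimum_codeword_length(total))
```

For the example rate of (7 + 3/4), the sketch yields sizes of 128, 64, 32 and 16 for the leaves 00, 01, 10 and 11, a total of 240 codewords, and a minimum codeword length of ten bits.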

As indicated by step 230, the codewords are then
assigned to the subconstellations. Since, as explained
above, there are 252 possible 10-bit dc-free channel sequences,
a decision must be made as to which twelve
of the 252 possible dc-free channel sequences will be
discarded. Furthermore, a decision must be made as to
how the remaining 240 codewords will be divided
among and assigned to the subconstellations $\Omega_{00}$, $\Omega_{01}$,
$\Omega_{10}$, $\Omega_{11}$.

It will be evident from the encoding algorithm discussed
below that the various subconstellations are not
utilized in proportion to their relative sizes. For example,
FIG. 4 is a table showing the relative sizes and the actual 
probabilities of utilization which result from using the 
mapping technique discussed below for the four subconstellations
discussed above. While channel sequences
or codewords from a particular subconstellation
are selected with equal probability, the encoding algorithm
discussed below favors channel sequences or
codewords from the larger subconstellations at the expense
of the smaller subconstellations. This non-equiprobable
feature allows the less desirable dc-free sequences to 
be assigned to the smaller subconstellations, thereby 
decreasing the likelihood that they will be utilized during 
the coding process. Conversely, the more desirable dc- 
free sequences may be assigned to the larger subcon- 
stellations, thereby increasing the likelihood that they 
will be utilized during the coding process. 
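
The non-equiprobable behavior can be estimated with a short calculation. The sketch below rests on assumptions that are not stated in the text: the user bits are taken to be independent and equally likely to be "0" or "1", and the branch-selection rule is the one described with respect to FIG. 5A, under which a given codeword takes the right branch at the level associated with index p with probability 1/2 times 1/2^(m-p). The function name is hypothetical, and the resulting values are not claimed to reproduce the table of FIG. 4.

```python
from fractions import Fraction
from itertools import product

def utilization(d, m):
    """Probability that a given codeword lands in each leaf of the tree."""
    P = [i for i in range(m) if (d >> i) & 1]
    probabilities = {}
    for path in product("01", repeat=len(P)):
        prob = Fraction(1)
        for step, p in zip(path, P):
            # Flag bit is "1" with probability 1/2; the codeword is then
            # the one chosen among the 2**(m - p) members of its subdivision.
            right = Fraction(1, 2) * Fraction(1, 2 ** (m - p))
            prob *= right if step == "1" else 1 - right
        probabilities["".join(path)] = prob
    return probabilities

usage = utilization(d=3, m=2)
sizes = {"00": 128, "01": 64, "10": 32, "11": 16}
for label in usage:
    relative_size = Fraction(sizes[label], sum(sizes.values()))
    print(label, "relative size", relative_size, "utilization", usage[label])
```

Under these assumptions the largest subconstellation holds 8/15 of the codewords but is used 21/32 of the time, while the smallest holds 1/15 of the codewords and is used only 1/32 of the time.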

In certain communication systems using, for example,
voiceband data transmission or fading channels, an
important concern is the elimination of high energy symbols.
In the magnetic recording context, however, eliminating
high energy symbols is not an important issue
because each bit in a channel sequence or codeword
represents either a positive or negative magnetic flux of
+1 or -1, respectively. Rather, as discussed above, a desirable
feature for a magnetic recording system is the
ability to generate high rate codes that are dc-free and 
that obviate the timing and gain problems which result 
from long strings of consecutive bits having the same 
value. 

As indicated in step 232 of FIG. 2A, in a presently 
preferred embodiment, dc-free channel sequences that 
include a relatively long string of consecutive bits having 
the same value, and, in particular those channel se- 
quences wherein the string of consecutive bits which 
have the same value appears at the beginning or at the 
end of the sequence, are discarded. Thus, for example, 
the channel sequences 1111100000 and 0000011111 
would be discarded. Also, as indicated by step 234, oth- 
er channel sequences containing relatively long strings 
of consecutive bits having the same value are preferably 
assigned to the smaller subconstellations. For example, 
the channel sequences 0001111100 and 1110000011 
would be assigned to the subconstellation $\Omega_{11}$. Conversely,
as shown in step 236, channel sequences that
do not contain a relatively long string of consecutive bits 
having the same value are preferably assigned to the 
larger subconstellations. The channel sequences 
1010101010 and 0101010101 would, therefore, be assigned
to the subconstellation $\Omega_{00}$. In general, codewords
with fewer bit transitions are preferably assigned
to the smaller subconstellations, whereas codewords 
with more bit transitions are preferably assigned to the 
larger subconstellations. 
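
One possible way to carry out steps 232 through 236 is sketched below. The ranking keys are assumptions chosen for illustration only, since the text does not fix an exact ordering: sequences are discarded according to the longest run at either end, and the remaining sequences are ranked by longest run and then by number of transitions. The function names are hypothetical.

```python
from itertools import combinations, groupby

SIZES = {"00": 128, "01": 64, "10": 32, "11": 16}  # largest first

def dc_free_words(n):
    """All n-bit words with equal numbers of ones and zeros."""
    words = []
    for ones in combinations(range(n), n // 2):
        bits = ["0"] * n
        for i in ones:
            bits[i] = "1"
        words.append("".join(bits))
    return words

def runs(word):
    """Lengths of the maximal runs of identical bits, left to right."""
    return [len(list(group)) for _, group in groupby(word)]

def assign(words, sizes):
    def edge_run(w):
        r = runs(w)
        return max(r[0], r[-1])

    def max_run(w):
        return max(runs(w))

    # Step 232 (assumed ordering): discard the words whose leading or
    # trailing run is longest, breaking ties by overall longest run.
    n_discard = len(words) - sum(sizes.values())
    by_edge = sorted(words, key=lambda w: (edge_run(w), max_run(w), w),
                     reverse=True)
    discarded, kept = by_edge[:n_discard], by_edge[n_discard:]

    # Steps 234 and 236 (assumed ordering): short runs and many
    # transitions first, so the best words fill the largest subconstellation.
    ranked = sorted(kept, key=lambda w: (max_run(w), -len(runs(w)), w))
    table, start = {}, 0
    for label, size in sizes.items():
        table[label] = ranked[start:start + size]
        start += size
    return table, discarded

table, discarded = assign(dc_free_words(10), SIZES)
print({label: len(words) for label, words in table.items()}, len(discarded))
```

With these assumed keys, 1111100000 and 0000011111 are among the twelve discarded sequences, 0001111100 and 1110000011 fall in the smallest subconstellation, and 1010101010 and 0101010101 fall in the largest, consistent with the examples given in the text. The sketch does not attempt step 238, which separates similar codewords.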

Finally, as indicated by step 238, with respect to the 
remaining channel sequences, those codewords which 
are similar to one another are preferably assigned to dif- 
ferent subconstellations, whenever possible. For exam- 
ple, channel sequences which are identical except for 
two bits which are interchanged would preferably be as- 
signed to different subconstellations. 

Assigning the possible dc-free channel sequences 
to the subconstellations in the above manner helps re- 
duce the loss of timing and gain parameters in the sys- 
tem. It also reduces the likelihood of errors occurring in 
the transmission, recording and identification of the 
channel sequences. This is because, as a result of the 
assignment process for assigning codewords to the 
subconstellations according to the present invention, 
codewords with frequent bit transitions inherently have
fewer nearest neighbors. Codewords with fewer nearest 
neighbors allow fewer errors to propagate into later 
stages of decoding. 

As indicated in step 235, each codeword in a particular
one of the subconstellations is also assigned a
q-bit address, where q is the minimum number of bits
required to address all the codewords in the particular
subconstellation. Specifically, for a subconstellation of
size Q, each address is of length q, where $q = \log_2 Q$.
Once the codewords are assigned to the subconstellations,
the codewords are preferably stored in a codebook
132 in the modulation coder 130, as indicated by
step 250. The codebook 132 may be advantageously
implemented in memory 131 such as read only memory
or random access memory. The correspondence between
the codewords and the subconstellations, as well
as the addresses assigned to each codeword, also are
stored in the memory 131. Although the embodiment
shown in FIG. 1 includes a single codebook which incorporates
all of the subconstellations, it should be understood
that the various subconstellations may be
stored in a plurality of codebooks. Thus, for example,
each subconstellation may be stored in a separate
codebook.

The determination of the number and size of the
subconstellations, as well as the assignment of the dc-free
codewords to the subconstellations, may be performed
by a person designing the modulation coder 130.
In that situation, the codewords, the correspondence
between the codewords and subconstellations, and the
address assigned to each codeword, are stored in the
memory 131 when the modulation coder 130 is designed.
In an alternative embodiment of the present invention,
the modulation coder 130 is programmed to
perform the steps 200 through 239 in FIG. 2, as well as
the steps 232 through 238 in FIG. 2A, in response to
receiving a signal indicating the desired rate p. For this
purpose, the modulation coder 130 preferably has at
least one input lead 133 for receiving the values of n, m
and d. The values n, m and d, or their equivalents, may
be entered into the coder 130 by a user of the system
employing, for example, a keyboard. The modulation
coder 130 would then perform the steps 200 through
239 in FIG. 2, as well as the steps 232 through 238 in
FIG. 2A. The assigned codewords, the correspondence
between the codewords and the subconstellations, and
the address of each codeword, would then be stored in
the memory 131.

The discussion that follows describes how the modulation
coder 130 is further designed or programmed to
map a sequence of $2^m p = (2^m n + d)$ user bits received from
the output of the Reed-Solomon coder 120 to $2^m$ dc-free
channel sequences or codewords stored in the codebook 132.
Returning to the example used earlier in which a rate of (7
+ 3/4) bits per codeword was considered, m is equal to
two, resulting in four channel sequences or codewords
and a total of thirty-one user bits. Thus, the modulation
coder 130 maps the thirty-one user bits into four 10-bit
dc-free channel sequences or codewords. The four as-yet
unspecified codewords conveniently may be labelled
$A_0$, $A_1$, $A_2$ and $A_3$.

FIG. 5 is a flow chart showing the steps used in
mapping the user bits to the codewords. In general, the
modulation coder 130 utilizes a particular sequence of
user bits to determine from which subconstellation the
codewords will be selected and to identify the particular
codewords which will be sent to the magnetic medium
140 and recorded thereon. More particularly, as explained
in greater detail below, the modulation coder
130 preferably is designed to perform steps 300 through
340 in FIG. 5, as well as steps 311 through 318 in FIG.
5A.

As indicated by step 300, a partitioning technique
is performed as follows. For each member p of the
set P defined above, the entire sequence of the as-yet
unspecified $2^m$ codewords, $A_0 \ldots A_{2^m - 1}$, is partitioned
into $2^p$ non-overlapping subdivisions $T_{p,i}$, where $0 \le i < 2^p$.
Again, referring to the example discussed above, P
= {0, 1}. Thus, for p = 0, there will be one subdivision $T_{0,0}$
which includes all four of the as-yet unspecified codewords,
$A_0$, $A_1$, $A_2$ and $A_3$. Similarly, for p = 1, there will
be two subdivisions, $T_{1,0}$ and $T_{1,1}$, each of which includes
two of the four as-yet unspecified codewords.
The partitioned subdivisions need not be contiguous
sets of codewords and are disjoint for different p. Thus,
for the purposes of illustration, the subdivision $T_{1,0}$ will
include the as-yet unspecified codewords $A_0$ and $A_3$,
and the subdivision $T_{1,1}$ will include the as-yet unspecified
codewords $A_1$ and $A_2$.

Next, as indicated by step 310, the sequence of user
bits that are received at the input to the modulation
coder 130 are used to determine and identify which of 
the sub-constellations each of the as-yet unspecified 
codewords comes from, according to an encoding algo- 
rithm, described further below with respect to FIG. 5A. 
The encoding algorithm may be viewed as "walking" 
each of the as-yet unspecified codewords down the con- 
stellation tree until each one is associated with one of 
the leaf nodes in the lowest level of the constellation 
tree. 

As indicated by 311 in FIG. 5A, one assumes that
the $2^m$ as-yet unspecified codewords are at the root of
the constellation tree. As shown in step 312, each of the
partitioned subdivisions $T_{p,i}$ is addressed, in ascending
order of p, by using a portion of the user bits in a predetermined
order. Preferably, the user bits are used in sequential
order. Next, as indicated by step 313, for each
partitioned subdivision, the next user bit in the sequence
determines whether all of the as-yet unspecified codewords
in that subdivision proceed down the constellation
tree over the first, or left, branch from each of their
respective current positions, or whether exactly one as-yet
unspecified codeword proceeds down the second,
or right, branch, while the remaining as-yet unspecified
codewords proceed down the first, or left, branch. For
example, as shown by steps 314 and 315, respectively,
a user bit of value "0" would indicate that all the as-yet
unspecified codewords in the particular subdivision proceed
down the first, or left, branch, whereas a user bit
of value "1" would indicate that exactly one as-yet unspecified
codeword in the particular subdivision proceeds
down the second, or right, branch, while the remaining
as-yet unspecified codewords in the particular
subdivision proceed down the first, or left, branch. In the
latter situation, as shown in step 316, the next (m - p)
user bits in the sequence are used to identify and determine
which particular as-yet unspecified codeword proceeds
down the second, or right, branch. This process
is continued, as indicated by step 317, until all the partitioned
subdivisions have been addressed. Once all the
partitioned subdivisions have been addressed, the encoding
algorithm of step 310 ends, as shown by 318.

It should be noted that, although in the above discussion, with respect
to steps 204, 206, 314, 315 and 316, the first branch of each node has
been associated with the left-going branch and the second branch has
been associated with the right-going branch, the association of the
respective first and second branches may be reversed if done
consistently for all relevant branches.

Referring again to the example discussed above, consider the exemplary
sequence of thirty-one user bits 100011...b_{31} shown in FIG. 6. The
four as-yet unspecified codewords, designated A_0, A_1, A_2 and A_3, are
assumed to be sitting at the root node Ω, as shown in FIG. 6A. The first
user bit, whose value is "1", indicates that exactly one of the four
codewords A_0, A_1, A_2 and A_3 in the subdivision T_{0,0} proceeds down
the right branch. The next two bits, which are both "0", are used to
specify which one of the four codewords proceeds down the right branch.
Again, for the purposes of illustration, A_0 is assumed to be the
specified codeword that proceeds down the right branch. A_1, A_2 and A_3
thus proceed down the left branch. The result is shown in FIG. 6B.

The next user bit, which is a "0", indicates that both codewords in the
next subdivision, T_{1,0} for example, proceed down the left branches
from their current positions. This situation is shown in FIG. 6C. The
fifth user bit, which is a "1", indicates that exactly one of the
unspecified codewords in the next subdivision, T_{1,1}, proceeds down
the right branch from its current branch. The next user bit, which is
also a "1", is used to determine which of the two unspecified codewords,
A_1 or A_2, proceeds down the right branch. For the purposes of
illustration, it is assumed that A_1 is specified as the codeword which
proceeds down the right branch from its current position. Thus, A_2
proceeds down the left branch from its current position. FIG. 6D shows
the resulting situation.

In the example discussed above, and as shown in FIG. 6D, two of the
as-yet unspecified codewords, A_2 and A_3, will be selected from among
the codewords in the subconstellation Ω_{00}. One of the as-yet
unspecified codewords, A_1, will be selected from the codewords in the
subconstellation Ω_{01}, and the remaining as-yet unspecified codeword,
A_0, will be selected from among the codewords in the subconstellation
Ω_{10}. In this particular example, none of the as-yet unspecified
codewords will be selected from the codewords in the subconstellation
Ω_{11}.
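
The walk of FIGS. 6A through 6D can be sketched in a few lines of Python.
This is only an illustrative sketch, not the patented implementation: the
function name `walk_tree`, the representation of each path as a string of
"0" (left) and "1" (right) branches, and the `partitions` argument are
assumptions introduced here. In particular, the ordering of codewords
inside each subdivision is chosen solely so that the selection bits
reproduce the choices made "for the purposes of illustration" in the text.

```python
def walk_tree(bits, m, partitions):
    """Walk 2**m as-yet unspecified codewords down the constellation tree.

    bits       -- list of 0/1 user bits, consumed in sequential order
    partitions -- hypothetical dict: level index p -> list of subdivisions,
                  each subdivision a list of 2**(m - p) codeword indices
    Returns (paths, used): one branch string per codeword, and the number
    of user bits consumed by the walk.
    """
    paths = [""] * (2 ** m)
    pos = 0
    for p in sorted(partitions):              # ascending order of p
        for subdivision in partitions[p]:
            flag = bits[pos]                   # 0: all left, 1: exactly one right
            pos += 1
            chosen = None
            if flag == 1:
                # The next (m - p) bits pick the one codeword that goes right.
                index = 0
                for b in bits[pos:pos + (m - p)]:
                    index = (index << 1) | b
                pos += m - p
                chosen = subdivision[index]
            for cw in subdivision:
                paths[cw] += "1" if cw == chosen else "0"
    return paths, pos


# Example of FIG. 6: m = 2, leading user bits 100011.
# The order [2, 1] inside T_{1,1} is an assumption made to match the text.
partitions = {0: [[0, 1, 2, 3]], 1: [[0, 3], [2, 1]]}
paths, used = walk_tree([1, 0, 0, 0, 1, 1], 2, partitions)
print(paths)  # ['10', '01', '00', '00']
print(used)   # 6
```

Under these assumptions the sketch places A_0 in Ω_{10}, A_1 in Ω_{01},
and A_2 and A_3 in Ω_{00}, using six user bits, which agrees with FIG. 6D
as described above.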

As should be clear from the example discussed above, the determination
of which subconstellation each of the as-yet unspecified codewords will
be selected from requires use of fewer than all the bits in the sequence
of user bits. As indicated by step 320 and as shown in FIG. 6, the
remaining user bits are utilized to determine which specific channel
sequences or codewords are specified. In fact, the remaining user bits
are exactly sufficient to address all of the codewords in the selected
subconstellations for each of the as-yet unspecified codewords. The
addressing may be achieved, for example, by the use of look-up tables.
As noted above, for a subconstellation of size Q, q bits are required to
specify the address of a particular one of the codewords in that
subconstellation, where q = log_2 Q.

In the example discussed above, six of the thirty-one user bits were
required to determine from which of the four subconstellations, Ω_{00},
Ω_{01}, Ω_{10} and Ω_{11}, each of the as-yet unspecified codewords,
A_0, A_1, A_2 and A_3, will be selected. The remaining twenty-five user
bits may be used to specify the addresses of the particular codewords
within each of the selected subconstellations. Seven bits, for example
bits b_7 through b_{13}, are required to determine which one of the 128
codewords in the subconstellation Ω_{00} will be used for A_2. Likewise,
seven bits, for example b_{14} through b_{20}, are required to determine
which one of the 128 codewords in the subconstellation Ω_{00} will be
used for A_3. Six user bits, for example b_{21} through b_{26}, are
required to determine which one of the 64 codewords in the
subconstellation Ω_{01} will be used for A_1, and five user bits, for
example b_{27} through b_{31}, are required to determine which one of
the 32 codewords in the subconstellation Ω_{10} will be used for A_0. In
this manner, all thirty-one user bits received from the output of the
Reed-Solomon coder 120 are used to map, and thereby encode, the
thirty-one user bits to four 10-bit dc-free codewords obtained from the
codebook 132.
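
The bit accounting in this example can be checked mechanically. The
sketch below assumes, consistently with the subconstellation sizes
quoted above, that each right branch taken at level p removes (m - p)
bits from a codeword's n-bit address; the helper name `address_widths`
and the "0"/"1" path strings are hypothetical and introduced only for
illustration.

```python
def address_widths(paths, m, levels, n):
    """Return the number of address bits q needed for each codeword.

    paths  -- one branch string per codeword ("0" = left, "1" = right)
    levels -- the indices p, in ascending order, that label the tree levels
    A right branch at level p is assumed to carry weight (m - p), so a
    codeword whose path has total weight r needs q = n - r address bits.
    """
    widths = []
    for path in paths:
        weight = sum(m - p for p, branch in zip(levels, path) if branch == "1")
        widths.append(n - weight)
    return widths


# Example above: A_0 in Ω_{10}, A_1 in Ω_{01}, A_2 and A_3 in Ω_{00}.
paths = ["10", "01", "00", "00"]
widths = address_widths(paths, m=2, levels=[0, 1], n=7)
print(widths)           # [5, 6, 7, 7]
print(6 + sum(widths))  # 31, i.e. 2**m * n + d with n = 7, d = 3
```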

Once the codewords that will be used for the encoding have been
specified, the modulation coder 130 generates a sequence of bits
corresponding to the selected 2^m dc-free codewords as indicated in step
330. A signal generating circuit or chip 134, for example, may be used
to generate the bits corresponding to the selected codewords. Other
suitable means may also be employed to generate the sequence of bits
corresponding to the selected codewords. Then, as indicated by step 340,
the coder 130 preferably transmits the specified codewords to the write
head 135. Then, in step 350, the write head 135 records the specified
codewords on the magnetic medium 140, preferably in a predetermined
order, such as in the order corresponding to A_0, A_1, A_2 and A_3.

At some later time, as indicated by step 400 in FIG. 7, the codewords
recorded on the magnetic medium 140 are read by the read head 145 and
preferably sent to the equalizer 150. The (2^m n + d) user bits are
recovered, as explained further below, by the modulation decoder 160.
The modulation decoder 160 may be implemented, for example, using a
processor appropriately designed or programmed to perform the inverse of
the functions performed by the modulation coder 130. The modulation
decoder 160 thus allows the original (2^m n + d) user bits to be
recovered and generated from the codewords that are read from the
magnetic medium 140.

In one embodiment, the modulation decoder 160 has a look-up table 162
which is used by the modulation decoder 160 to perform the inverse of
step 320 by mapping each codeword to its corresponding portion of the
sequence of user bits. The look-up table 162, which may be stored in a
memory 163 such as a read-only memory or a random access memory,
includes each codeword in the codebook 132 and its corresponding q-bit
address assigned to it in step 239. The memory 163 also stores the
correspondence between each codeword and the identity of the
subconstellation to which it belongs.

As indicated by step 404, the modulation decoder 160 retrieves the q-bit
address associated with each of the 2^m codewords read from the magnetic
recording medium 140. Using the exemplary sequence of user bits
discussed above with respect to FIG. 6, the modulation decoder 160 would
retrieve the address bits b_7 through b_{13} corresponding to the
codeword that was designated by A_2. Similarly, the modulation decoder
160 would retrieve the address bits b_{14} through b_{20}, b_{21}
through b_{26}, and b_{27} through b_{31}, corresponding, respectively,
to the codewords that were designated by A_3, A_1 and A_0.

Next, as indicated by step 406, the modulation decoder 160 arranges the
retrieved sets of address bits in the order in which the corresponding
codewords were received. Again, with reference to the exemplary sequence
of user bits discussed above, the modulation decoder would arrange the
bits b_7 through b_{31} in sequential order, corresponding to the order
of the received codewords A_2, A_3, A_1 and A_0.

The present invention also provides an added level of error detection.
This added level of error detection is possible because the encoding
algorithm discussed above with respect to FIG. 5A does not allow the 2^m
codewords corresponding to a sequence of (2^m n + d) user bits to be
selected from certain combinations of subconstellations. Thus, in the
example discussed above, the four codewords may not all be selected from
the subconstellation Ω_{11}. In fact, in that particular example, no
more than one of the as-yet unspecified codewords may be selected from
the subconstellation Ω_{11}. As indicated by step 408, the modulation
decoder 160 retrieves from the memory 163 the identity of the
subconstellations to which each of the 2^m received codewords belongs.
Then, as indicated in 410, the modulation decoder 160 detects whether an
impermissible combination of codewords has occurred. Thus, with
reference to the example discussed above, the modulation decoder 160
preferably is programmed to detect the situation, for example, where two
or more codewords are selected from the subconstellation Ω_{11}. The
modulation decoder 160 is also programmed to detect other impermissible
combinations of codewords. As indicated by step 415, if such an error is
detected, the modulation decoder 160 preferably provides an error
detection alarm or signal on an error detection electronic circuit 166,
for example.
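
One way to picture the check of step 410 is as a test of whether the
received combination could have been produced by the tree walk at all.
The sketch below is an assumption-laden illustration rather than the
decoder's actual logic: it presumes that the partition into subdivisions
is fixed and known to the decoder, and the names `is_permissible` and
`partitions` are hypothetical.

```python
def is_permissible(paths, partitions):
    """Return True if at most one codeword per subdivision took a right branch.

    paths      -- one branch string per received codeword, identifying its
                  subconstellation ("0" = left, "1" = right at each level)
    partitions -- hypothetical dict: level index p -> list of subdivisions,
                  each a list of codeword indices
    """
    for level, p in enumerate(sorted(partitions)):
        for subdivision in partitions[p]:
            rights = sum(1 for cw in subdivision if paths[cw][level] == "1")
            if rights > 1:
                return False  # two codewords cannot both go right here
    return True


partitions = {0: [[0, 1, 2, 3]], 1: [[0, 3], [2, 1]]}
print(is_permissible(["10", "01", "00", "00"], partitions))  # True
print(is_permissible(["11", "11", "00", "00"], partitions))  # False
```

The second call models two codewords read from Ω_{11}, the impermissible
situation named in the text.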

If no error is detected, then as indicated by step 420, the modulation
decoder 160 generates the remaining bits in the original sequence of
user bits based upon the combination and order of subconstellations to
which the 2^m received codewords belong. It will be recalled that, in
step 310, a portion of the sequence of user bits was used to determine
from which subconstellation each of the as-yet unspecified codewords
would be selected. The modulation decoder 160 now uses the identity and
order of the subconstellations to recover that portion of the sequence
of user bits. For this purpose, the modulation decoder 160 preferably
includes a plurality of logic gates 164 or other dedicated hardware.
Signals representing the identity of the subconstellations to which the
2^m received codewords belong are sent as inputs to the logic gates 164.
The logic gates 164 preferably are hard-wired so that the output signals
correspond to the remaining portion of bits in the original sequence of
user bits.

Referring again to the example discussed above, the identity of the
subconstellations retrieved in step 408 is Ω_{00}, Ω_{00}, Ω_{01}, and
Ω_{10}. The output of the logic gates 164 would then be "100011", which
represents the first six bits in the original sequence of user bits as
shown in FIG. 6. As indicated by step 422, these remaining bits are
appended to the beginning of the arrangement of bits that resulted from
step 406. In this manner, the entire sequence of user bits is recovered
from the 2^m codewords recorded on the magnetic medium 140.
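
The text performs this recovery with hard-wired logic gates. Purely for
illustration, the same mapping can be written in software as the inverse
of the tree walk. The function name `recover_prefix`, the path strings,
and the fixed `partitions` (including the ordering inside each
subdivision) are assumptions made here so that the example reproduces
the bits "100011".

```python
def recover_prefix(paths, m, partitions):
    """Rebuild the leading user bits from the subconstellation identities.

    paths      -- branch string of each codeword, indexed by codeword number
    partitions -- hypothetical dict: level index p -> list of subdivisions
    Raises ValueError if the combination could not have been encoded.
    """
    bits = []
    for level, p in enumerate(sorted(partitions)):
        for subdivision in partitions[p]:
            rights = [i for i, cw in enumerate(subdivision)
                      if paths[cw][level] == "1"]
            if len(rights) > 1:
                raise ValueError("impermissible combination of codewords")
            if not rights:
                bits.append(0)            # all codewords went left
            else:
                bits.append(1)            # exactly one codeword went right
                index = rights[0]
                # Emit the (m - p) selection bits, most significant first.
                bits.extend((index >> k) & 1 for k in range(m - p - 1, -1, -1))
    return bits


partitions = {0: [[0, 1, 2, 3]], 1: [[0, 3], [2, 1]]}
# A_0 in Ω_{10}, A_1 in Ω_{01}, A_2 and A_3 in Ω_{00}
print(recover_prefix(["10", "01", "00", "00"], 2, partitions))
# [1, 0, 0, 0, 1, 1]
```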

Finally, as indicated by step 425, the digital informa- 
tion originally entered into the system may be recovered 
by sending signals representing the recovered se- 
quence of user bits to the Reed-Solomon decoder 170. 
The output of the Reed-Solomon decoder 170 is then 
preferably sent to the Lempel-Ziv decompressor 180, 
the output of which corresponds to the digital informa- 
tion originally entered into the system. 

The present invention thus provides several advantageous features.
First, because the cross-constellation technique maps over multiple
dimensions, it allows a fractional number of bits to be mapped into each
dc-free codeword. This allows greater flexibility in the selection
of a desired rate. It also provides a non-equiprobable 
coding mechanism for favoring certain codewords over 
others. Specifically, for example, codewords that do not 
have relatively long strings of consecutive bits having 
the same value may be favored over those that do by 
assigning the favored codewords to the larger constel- 
lations. In this manner, the present invention provides 
an improvement for obviating the timing and gain prob- 
lems previously associated with reading information 
stored on magnetic media. 

Furthermore, for codewords having a specified length, the size of the
codebook is larger than the size of the codebook that is realized by
using previously known techniques such as the one described above in the
background section. Thus, in the example discussed above, using dc-free
codewords which have a length of ten bits results in a codebook
containing 240 codewords. This compares with the codebook size of 128
codewords for 10-bit codewords using the known technique described above
in the background section. As previously noted, larger codebooks allow
higher rates to be achieved. The present invention, therefore, improves
the transmission rate and efficiently uses the set of possible dc-free
sequences of a given length for encoding and recording information on
magnetic media.
The efficiency of the mapping may be measured by the coefficient
expansion ratio, which equals the number of codewords divided by 2^P.
Thus, for the example discussed above, the coefficient expansion ratio
equals 240/2^7.75, or approximately 1.11.
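
The codebook size of 240 can be reproduced from the rate parameters of
the example (n = 7, m = 2, d = 3). The sketch below assumes the size
rule 2^n 2^{-r(i)} with a weight of (m - p) on each second branch, as in
the constellation-tree construction; the function name
`subconstellation_sizes` and the path-string keys are hypothetical.

```python
from itertools import product


def subconstellation_sizes(n, m, d):
    """Map each leaf path ("0" = first branch, "1" = second) to its size."""
    # Tree levels correspond to the indices i for which binary digit d_i is 1.
    levels = [i for i in range(m) if (d >> i) & 1]
    sizes = {}
    for path in product("01", repeat=len(levels)):
        weight = sum(m - p for p, branch in zip(levels, path) if branch == "1")
        sizes["".join(path)] = 2 ** (n - weight)
    return sizes


sizes = subconstellation_sizes(n=7, m=2, d=3)
total = sum(sizes.values())
ratio = total / 2 ** (7 + 3 / 4)   # codewords divided by 2**rate
print(sizes)             # {'00': 128, '01': 64, '10': 32, '11': 16}
print(total)             # 240
print(round(ratio, 4))   # 1.1149
```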

Although the present invention has been described 
with reference to specific embodiments, it will be appre- 
ciated that other arrangements within the spirit and 
scope of the present invention will be readily apparent 
to persons of ordinary skill in the art. The present inven- 
tion is, therefore, limited only by the appended claims. 



Claims 

1. A method of encoding digital information at an average rate of
(n + d/2^m) user bits per codeword, wherein n is a positive integer,
and d is a positive integer less than 2^m, the method comprising the
steps of:

receiving a sequence of (2^m n + d) user bits;
selecting 2^m dc-free codewords from among a plurality of
non-intersecting subconstellations of dc-free codewords, wherein
codewords with the most frequent bit transitions are assigned to
subconstellations different from codewords with the least frequent bit
transitions, wherein the step of selecting comprises the steps of:

identifying from which of the plurality of subconstellations each of
the 2^m dc-free codewords will be selected based upon a portion of the
sequence of user bits; and
specifying the 2^m dc-free codewords based upon the remaining user bits
in the sequence; and

generating a sequence of bits corresponding to the selected 2^m dc-free
codewords.


2. The method of claim 1 wherein codewords with the most frequent bit
transitions are assigned to larger subconstellations and codewords with
the fewest bit transitions are assigned to smaller subconstellations,
wherein the step of identifying comprises executing an algorithm which
uses the subconstellations in a non-equiprobable manner such that a
particular codeword from a larger subconstellation is more likely to be
used than a particular codeword from a smaller subconstellation.

3. The method of claim 2 wherein the step of selecting 
comprises selecting the codewords from at least 
one codebook wherein codewords with more bit 
transitions are assigned to larger subconstellations 
than codewords with fewer bit transitions. 

4. The method of claim 2 wherein the number of subconstellations is
determined by creating a constellation tree based at least upon the
value of d, wherein

d = \sum_{i=0}^{m-1} d_i 2^i

and wherein a set P represents the ordering of indices for which d_i is
not equal to zero, wherein the step of identifying further comprises
partitioning, for each member of the set P, 2^m as-yet unspecified
codewords into sets of non-overlapping subdivisions, and for each of
said subdivisions, determining, based upon at least one of the user bits
in said portion of the sequence of user bits, whether all of the as-yet
unspecified codewords in a particular subdivision proceed down
respective first branches in the constellation tree, or whether exactly
one of the as-yet unspecified codewords in the particular subdivision
proceeds down a respective second branch in the constellation tree while
the remaining as-yet unspecified codewords in the particular subdivision
proceed down the respective first branches.

5. The method of claim 4 wherein the step of identify- 
ing further includes the step of determining, based 
upon at least another one of the user bits in said 
portion of the sequence of user bits, which one of 
the as-yet unspecified codewords in a particular 
subdivision proceeds down the respective second 
branch, if it is determined that fewer than all of the 
as-yet unspecified codewords in the particular sub- 
division proceed down the respective first branch- 
es. 

6. The method of claim 5 wherein the step of specify- 
ing comprises the step of using the remaining user 
bits in the sequence to address entries in look-up 
tables wherein the look-up tables correspond to 
said subconstellations. 



7. The method of claim 5 wherein the step of identify- 
ing comprises the step of using the sequence of us- 
er bits in the order in which they are received. 

8. A method of encoding digital information at an average rate of
(n + d/2^m) user bits per codeword for storage on a magnetic medium,
wherein n is a positive integer, and d is a positive integer less than
2^m, the method comprising the steps of:

receiving a sequence of (2^m n + d) user bits;
selecting 2^m dc-free codewords from among a plurality of
non-intersecting subconstellations of dc-free codewords, wherein the
step of selecting comprises the steps of:

identifying from which of the plurality of subconstellations each of
the 2^m dc-free codewords will be selected based upon a portion of the
sequence of user bits; and
specifying the 2^m dc-free codewords based upon the remaining user bits
in the sequence; and

recording the selected 2^m dc-free codewords on a magnetic medium.

9. The method of claim 8 wherein the step of recording comprises the
step of recording the 2^m dc-free codewords on digital audio tape.

10. The method of claim 9 wherein the step of recording comprises the
step of recording the 2^m dc-free codewords on a disk drive.

11. The method of claim 8 further comprising the steps of:

reading the 2^m dc-free codewords recorded on the magnetic medium; and
mapping the 2^m dc-free codewords read from the magnetic medium to the
sequence of (2^m n + d) user bits so as to recover the original sequence
of user bits.

12. The method of claim 11 wherein the step of mapping comprises the
step of using a look-up table to map each codeword to a respective
portion of the sequence of user bits.

13. The method of claim 12, wherein the step of mapping further
comprises the step of determining additional user bits in the sequence
of user bits based upon the identity and order of the subconstellations
from which the 2^m codewords were selected.

14. The method of claim 8 further comprising the steps of:

reading the 2^m dc-free codewords recorded on the magnetic medium; and
detecting whether an impermissible combination of codewords has been
read from the magnetic medium.

15. The method of claim 14 further comprising the step 
of providing an alarm indicating that an impermissi- 
ble combination of codewords has been read from 
the magnetic medium. 

16. A method of generating high rate codes for encoding digital
information comprising the steps of:

receiving a signal indicating a desired average code rate of
(n + d/2^m) user bits per codeword, wherein n is a positive integer, and
d is a positive integer less than 2^m;
generating a plurality of non-intersecting subconstellations of dc-free
codewords, wherein the number and size of the subconstellations are
determined according to the average code rate;
assigning dc-free codewords to each of said subconstellations, wherein
codewords with the most frequent bit transitions are assigned to
different subconstellations from codewords with the least frequent bit
transitions;
assigning a q-bit address to each one of said dc-free codewords, where
q = log_2 Q, and Q is the size of the respective subconstellation to
which the codeword was assigned;
storing in memory each of said codewords, the correspondence between the
codewords and the subconstellations, and the address assigned to each
respective codeword;
receiving a sequence of (2^m n + d) user bits;
selecting 2^m of the dc-free codewords based upon the received sequence
of user bits; and
generating a sequence of bits corresponding to the selected 2^m dc-free
codewords.

17. The method of claim 16 wherein the step of selecting comprises the
steps of:

identifying from which of the plurality of subconstellations each of the
2^m dc-free codewords will be selected based upon a portion of the
sequence of user bits in a predetermined order; and
specifying the 2^m dc-free codewords based upon the remaining user bits
in the sequence.

18. The method of claim 17 wherein the step of generating a plurality of
non-intersecting subconstellations comprises the steps of:

creating a constellation tree corresponding to the rate wherein the
number of levels in the constellation tree, excluding a root node,
equals the number of indices i for which d_i is not equal to zero, where

d = \sum_{i=0}^{m-1} d_i 2^i,

wherein each node, except leaf nodes at the lowest level of the
constellation tree, has a respective first and second branch, and
wherein each leaf node corresponds to one of the plurality of
non-intersecting subconstellations;
labelling each respective first branch with a weight of zero;
labelling each respective second branch in the jth level of the
constellation tree with a weight (m - p(j)), where p(j) is the jth index
i for which d_i is not equal to zero;
determining the respective relative weights r(i) of each of said leaf
nodes by summing the weights along each path of branches from the root
node to each respective one of said leaf nodes; and
determining the number of dc-free codewords to be assigned to each of
said subconstellations according to 2^n 2^{-r(i)}.

19. The method of claim 16 wherein the step of assigning dc-free
codewords to each of said subconstellations further comprises the step
of assigning codewords with the least bit transitions to the smaller
subconstellations.

20. The method of claim 19 wherein the step of assigning dc-free
codewords to each of said subconstellations comprises the step of
assigning codewords with the most bit transitions to the larger
subconstellations.

21. The method of claim 16 wherein the step of assigning dc-free
codewords to each of said subconstellations comprises the step of
assigning codewords having a length of N bits, where N is the smallest
number such that

\binom{N}{N/2}

is greater than or equal to the total number of codewords in the
plurality of subconstellations.

22. A coder for encoding digital information at an average code rate of
(n + d/2^m) user bits per codeword, wherein n is a positive integer, and
d is a positive integer less than 2^m, the apparatus comprising:

a memory comprising a plurality of non-intersecting subconstellations of
dc-free codewords, wherein each codeword is assigned to one of the
subconstellations, wherein the subconstellations are of different sizes,
and wherein codewords with the least frequent bit transitions are
assigned to the smaller subconstellations;
means for selecting, in response to a received sequence of (2^m n + d)
user bits, 2^m dc-free codewords from among the plurality of
subconstellations, wherein the subconstellations are used in a
non-equiprobable manner such that a particular codeword from a larger
subconstellation is more likely to be selected than a particular
codeword from a smaller subconstellation; and
means for generating a sequence of bits corresponding to the selected
2^m dc-free codewords.



23. The coder of claim 22 wherein codewords with the most frequent bit
transitions are assigned to larger subconstellations.

24. The coder of claim 22 further comprising:

means for identifying from which of the plurality of subconstellations
each of the 2^m dc-free codewords will be selected based upon a portion
of the sequence of user bits; and
means for specifying the 2^m dc-free codewords based upon the remaining
user bits in the sequence.


25. The apparatus of claim 24 wherein the number of subconstellations is
determined by creating a constellation tree based at least upon the
value of d, wherein

d = \sum_{i=0}^{m-1} d_i 2^i

and wherein a set P represents the ordering of indices for which d_i is
not equal to zero, wherein the means for identifying comprises a
processor programmed to partition, for each member of the set P, 2^m
as-yet unspecified codewords into sets of non-overlapping subdivisions,
and programmed to determine, for each of said subdivisions, based upon
at least one of the user bits in said portion of the sequence of user
bits, whether all of the as-yet unspecified codewords in a particular
subdivision proceed down respective first branches in the constellation
tree, or whether exactly one of the as-yet unspecified codewords in the
particular subdivision proceeds down a respective second branch in the
constellation tree while the remaining as-yet unspecified codewords in
the particular subdivision proceed down the respective first branches.

26. The coder of claim 22 wherein the number and size 
of the subconstellations stored in the memory de- 
pend upon the average code rate. 

27. The coder of claim 26 wherein the memory stores a q-bit address for
each of the codewords, where q = log_2 Q, and Q is the size of the
respective subconstellation to which the codeword was assigned.

28. The coder of claim 22 further comprising means for recording the
sequence of bits corresponding to the selected 2^m dc-free codewords on
a magnetic medium.

29. An apparatus for encoding digital information comprising:

means for receiving a signal indicating an average code rate of
(n + d/2^m) user bits per codeword, wherein n is a positive integer, and
d is a positive integer less than 2^m;
means for generating a plurality of non-intersecting subconstellations
of dc-free codewords, wherein the subconstellations are of different
sizes, and wherein the number and size of the subconstellations are
determined according to the average code rate;
means for assigning dc-free codewords to each of said subconstellations,
wherein codewords with the most frequent bit transitions are assigned to
the larger subconstellations and codewords with the least frequent bit
transitions are assigned to the smaller subconstellations;
a memory for storing said plurality of subconstellations and for storing
the correspondence between said codewords and said subconstellations;
means for selecting 2^m of the dc-free codewords in response to a
received sequence of (2^m n + d) user bits; and
means for generating a sequence of bits corresponding to the selected
2^m dc-free codewords.

30. The apparatus of claim 29 wherein the means for 
identifying comprises a processor programmed to 
execute an algorithm which uses the subconstella- 
tions in a non-equiprobable manner such that a par- 
ticular codeword from a larger sub-constellation is 
more likely to be used than a particular codeword 
from a smaller constellation. 

31. The apparatus of claim 29 wherein the means for assigning dc-free
codewords to each of said subconstellations comprises means for
assigning codewords having a length of N bits, where N is the smallest
number such that

\binom{N}{N/2}

is greater than or equal to the total number of codewords in the
plurality of subconstellations.

32. The apparatus of claim 30 wherein the means for 
assigning dc-free codewords to each of said sub- 
constellations comprises means for assigning 
codewords with more bit transitions to larger sub- 
constellations than codewords with fewer bit transi- 
tions. 

33. An apparatus for decoding digital information recorded on a magnetic
medium, wherein the digital information comprises 2^m dc-free codewords
corresponding to an original sequence of (2^m n + d) user bits, wherein
n is a positive integer, and d is a positive integer less than 2^m, the
apparatus comprising:

means for receiving said 2^m dc-free codewords;
means for recovering the (2^m n + d) user bits based upon the received
2^m dc-free codewords; and
means for generating a sequence of bits corresponding to the recovered
user bits.

34. The apparatus of claim 33 comprising:

a look-up table storing a set of dc-free codewords, including said 2^m
dc-free codewords, and storing respective address bits associated with
each of the codewords in the set and further storing the identity of a
respective one of a plurality of subconstellations to which each of the
codewords in the set belongs;
means for retrieving the respective address bits associated with each of
the 2^m dc-free codewords; and
means for arranging the retrieved address bits in the order in which the
corresponding 2^m dc-free codewords were received.

35. The apparatus of claim 34 comprising means for recovering the
remaining user bits in the original sequence of user bits based upon the
combination and order of subconstellations to which the 2^m received
dc-free codewords belong.

36. The apparatus of claim 35 further comprising means for detecting
whether an impermissible combination of codewords has been received.

37. The apparatus of claim 36 further comprising a circuit for providing
a signal indicating that an impermissible combination of codewords has
been detected.

38. A method of decoding digital information recorded 
on a magnetic medium, wherein the digital informa- 

10 tion comprises 2 m dc-free codewords correspond- 
ing to an original sequence of (2 m n + d) user bits, 
wherein n is a positive integer, and d is a positive 
integer less than 2 m , the method comprising the 
steps of: 

15 

receiving said 2 m dc-free codewords; 
recovering the (2 m n + d) user bits based upon 
the received 2 m dc-free codewords; and 
generating a sequence of bits corresponding to 
20 the recovered user bits. 

39. The method of claim 38 wherein the step of recov- 
ering comprises: 

25 retrieving respective address bits associated 

with each of the 2 m dc-free codewords; and 
arranging the retrieved address bits in the order 
in which the corresponding 2 m dc-free code- 
words were received. 

30 

40. The method of claim 39 wherein the step of
retrieving comprises retrieving said address bits from a
look-up table storing a set of dc-free codewords,
including said 2^m dc-free codewords, and storing said
respective address bits associated with each of the
codewords in the set and further storing the identity
of a respective one of a plurality of subconstellations
to which each of the codewords in the set belongs.

41. The method of claim 40 wherein the step of
recovering further comprises the step of recovering the
remaining user bits in the original sequence of user
bits based upon the combination and order of
subconstellations to which the 2^m received dc-free
codewords belong.

42. The method of claim 41 further comprising the step
of detecting whether an impermissible combination
of codewords has been received.

43. The method of claim 42 further comprising the step
of providing a signal indicating that an impermissible
combination of codewords has been detected.
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
200: CREATE CONSTELLATION TREE CORRESPONDING TO
RATE (n + d/2^m), WHERE THE NUMBER OF LEVELS, EXCLUDING
THE ROOT NODE, EQUALS THE NUMBER OF INDICES i FOR
WHICH d_i IS NOT EQUAL TO ZERO, WHERE d = Σ_i (d_i 2^i),
AND WHERE EACH NODE, EXCEPT LEAF NODES AT THE
LOWEST LEVEL, HAS A FIRST AND SECOND BRANCH

204: LABEL EACH FIRST BRANCH
WITH A WEIGHT OF ZERO

206: LABEL EACH SECOND BRANCH IN THE jTH LEVEL
WITH A WEIGHT (m - p(j)), WHERE p(j) IS THE jTH
INDEX i FOR WHICH d_i IS NOT EQUAL TO ZERO

210: DETERMINE RELATIVE WEIGHT, r(i), OF EACH
SUBCONSTELLATION BY SUMMING THE WEIGHTS ALONG
THE PATH OF BRANCHES FROM THE ROOT NODE TO THE
LEAF NODE CORRESPONDING TO THE SUBCONSTELLATION

220: DETERMINE SIZE OF EACH SUBCONSTELLATION
ACCORDING TO 2^n · 2^(-r(i))

230: ASSIGN CODEWORDS TO
SUBCONSTELLATIONS

235: ASSIGN q-BIT ADDRESS
TO EACH CODEWORD

250: STORE ASSIGNED CODEWORDS, THE
CORRESPONDENCE BETWEEN THE
CODEWORDS AND THE SUBCONSTELLATIONS,
AND THE ADDRESS ASSIGNED TO EACH
CODEWORD IN THE MEMORY

FIG. 2
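The tree construction of steps 200 through 220 can be sketched in a few lines of Python. This is an illustrative sketch only, not the patented implementation. It assumes that the levels are ordered by ascending index i, that the size expression of step 220 reads 2^n · 2^(-r(i)), and that leaves are labelled by their path ('0' for a first branch, '1' for a second branch). The function name and the parameters n = 7, m = 2, d = 3 are hypothetical choices, picked because they yield subconstellation sizes that sum to the 240 appearing in FIG. 4.

```python
def subconstellation_sizes(n, m, d):
    """Map each leaf of the constellation tree to its subconstellation size.

    Leaves are labelled by their path: '0' = first branch, '1' = second branch.
    """
    if not 0 < d < 2 ** m:
        raise ValueError("d must satisfy 0 < d < 2^m")
    # Indices i with d_i != 0 in d = sum_i d_i * 2^i (ascending order assumed).
    P = [i for i in range(m) if (d >> i) & 1]
    weights = {"": 0}                      # path label -> relative weight r
    for p in P:                            # one tree level per nonzero d_i
        nxt = {}
        for path, r in weights.items():
            nxt[path + "0"] = r            # step 204: first branch, weight 0
            nxt[path + "1"] = r + (m - p)  # step 206: second branch, weight m - p
        weights = nxt
    # Step 220, read here (as an assumption) as size = 2^n * 2^(-r) = 2^(n - r).
    return {path: 2 ** (n - r) for path, r in weights.items()}


print(subconstellation_sizes(7, 2, 3))
```

With these assumed parameters the four leaves receive 128, 64, 32 and 16 codewords respectively.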
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
232: DISCARD LEAST DESIRABLE CODEWORDS,
SUCH AS CODEWORDS WITH LONG STRINGS OF
CONSECUTIVE BITS HAVING THE SAME VALUE
AND APPEARING AT THE BEGINNING OR
END OF THE CODEWORD

234: ASSIGN OTHER LESS DESIRABLE CODEWORDS,
SUCH AS CODEWORDS WITH LONG STRINGS OF
CONSECUTIVE BITS HAVING THE SAME VALUE,
TO SMALLER SUBCONSTELLATIONS

236: ASSIGN MOST DESIRABLE CODEWORDS, SUCH
AS CODEWORDS WITH SHORT STRINGS OF
CONSECUTIVE BITS HAVING THE SAME VALUE,
TO LARGER SUBCONSTELLATIONS

238: ASSIGN REMAINING SIMILAR CODEWORDS
TO DIFFERENT SUBCONSTELLATIONS

FIG. 2A
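One possible reading of steps 232 through 236 is sketched below in Python. Several points are assumptions made for illustration: a dc-free codeword is taken to be a word with equal numbers of ones and zeros, "desirability" is measured only by run length, and the names `dc_free_words`, `longest_run`, `edge_run` and `assign` are hypothetical. Step 238 is not modelled.

```python
from itertools import combinations, groupby


def dc_free_words(length):
    """All words of the given even length with as many ones as zeros (assumed meaning of dc-free)."""
    words = []
    for ones in combinations(range(length), length // 2):
        bits = ["0"] * length
        for i in ones:
            bits[i] = "1"
        words.append("".join(bits))
    return words


def longest_run(word):
    """Length of the longest string of consecutive identical bits."""
    return max(len(list(group)) for _, group in groupby(word))


def edge_run(word):
    """Longer of the two runs at the beginning and at the end of the word."""
    head = len(word) - len(word.lstrip(word[0]))
    tail = len(word) - len(word.rstrip(word[-1]))
    return max(head, tail)


def assign(words, sizes):
    """Distribute words over subconstellations; sizes are listed largest first."""
    needed = sum(sizes)
    # Step 232: discard the words with the longest runs at either edge.
    kept = sorted(words, key=lambda w: (edge_run(w), longest_run(w), w))[:needed]
    # Steps 234 and 236: shortest runs go to the largest subconstellation.
    kept.sort(key=lambda w: (longest_run(w), w))
    parts, start = [], 0
    for size in sizes:
        parts.append(kept[start:start + size])
        start += size
    return parts


parts = assign(dc_free_words(10), [128, 64, 32, 16])
print([len(p) for p in parts])
```

For 10-bit words there are 252 balanced candidates, so 12 are discarded when 240 are needed.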



SUBCONSTELLATION   RELATIVE SIZE      ACTUAL PROBABILITY

[illegible]        128/240 = 0.53     0.66
10                 64/240 = 0.27      0.22
[illegible]        32/240 = 0.13      0.09
[illegible]        16/240 = 0.07      0.03

FIG. 4
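The two numeric columns of FIG. 4 can be checked with a short calculation. The sketch below rests on assumptions that this section does not state explicitly: n = 7, m = 2 and d = 3; equiprobable user bits; and, for each subdivision, one user bit that decides whether a codeword takes the second branch, followed by (m - p) bits selecting which one. The function name `fig4_columns` is hypothetical. That the result matches the figure's rounded values supports this reading but does not prove it.

```python
from fractions import Fraction


def fig4_columns(n, m, d):
    """Return (size, relative size, actual probability) per subconstellation."""
    P = [i for i in range(m) if (d >> i) & 1]
    rows = [(0, Fraction(1))]              # (relative weight r, probability)
    for p in P:
        # Assumed: flag bit set with probability 1/2, then one of the
        # 2^(m - p) codewords in the subdivision is picked uniformly.
        q = Fraction(1, 2 * 2 ** (m - p))
        rows = [item
                for r, pr in rows
                for item in ((r, pr * (1 - q)), (r + (m - p), pr * q))]
    sizes = [2 ** (n - r) for r, _ in rows]
    total = sum(sizes)
    return [(s, Fraction(s, total), pr) for s, (_, pr) in zip(sizes, rows)]


for size, relative, actual in fig4_columns(7, 2, 3):
    print(size, round(float(relative), 2), round(float(actual), 2))
```

A codeword from the largest subconstellation is thus used more often than its share of the table alone would suggest, which is the non-equiprobable behaviour the abstract describes.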
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
300: FOR EACH MEMBER p OF THE SET P,
PARTITION THE SEQUENCE OF 2^m AS-YET
UNSPECIFIED CODEWORDS INTO 2^p
NONOVERLAPPING SUBDIVISIONS

310: DETERMINE FROM WHICH SUBCONSTELLATION
EACH AS-YET UNSPECIFIED CODEWORD WILL BE
SELECTED BASED UPON THE SEQUENCE OF USER
BITS ACCORDING TO THE ENCODING ALGORITHM

320: USE LOOK-UP TABLES TO DETERMINE WHICH
CODEWORDS ARE SPECIFIED FROM THE
SELECTED SUBCONSTELLATIONS BASED UPON
THE REMAINING SEQUENCE OF USER BITS

330: GENERATE SEQUENCE OF BITS CORRESPONDING
TO THE SPECIFIED CODEWORDS

340: TRANSMIT SPECIFIED CODEWORDS
TO THE WRITE HEAD

350: RECORD THE CODEWORDS
ON THE MAGNETIC MEDIUM

FIG. 5
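Step 300 can be illustrated with a small Python sketch. It assumes that P is the set of bit positions at which d is nonzero and that each subdivision is a contiguous block of codeword positions; the figure only requires the subdivisions to be nonoverlapping, so contiguity is an illustrative choice. The function names are hypothetical.

```python
def partition(m, p):
    """Split codeword positions 0 .. 2^m - 1 into 2^p contiguous subdivisions."""
    size = 2 ** (m - p)                    # each subdivision holds 2^(m - p) positions
    return [list(range(k * size, (k + 1) * size)) for k in range(2 ** p)]


def subdivisions(m, d):
    """Partition once for every member p of P (bit positions set in d)."""
    P = [i for i in range(m) if (d >> i) & 1]
    return {p: partition(m, p) for p in P}


print(subdivisions(2, 3))
```

Because each p in P contributes 2^p subdivisions, the total number of subdivisions over all p equals d.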
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
ASSUME THAT THE 2^m AS-YET
UNSPECIFIED CODEWORDS ARE AT THE ROOT
NODE OF THE CONSTELLATION TREE

312: ADDRESS NEXT PARTITIONED SUBDIVISION,
IN ASCENDING ORDER OF p

[illegible]

314: ALL AS-YET
UNSPECIFIED CODEWORDS
IN THE SUBDIVISION
PROCEED DOWN THE
FIRST BRANCH FROM
THEIR RESPECTIVE
CURRENT POSITIONS

315: USE THE NEXT (m - p) USER BITS TO IDENTIFY
WHICH ONE OF THE AS-YET UNSPECIFIED
CODEWORDS IN THE SUBDIVISION
PROCEEDS DOWN THE SECOND BRANCH
FROM ITS CURRENT POSITION

316: THE REMAINING AS-YET UNSPECIFIED CODEWORDS
IN THE SUBDIVISION PROCEED DOWN THE FIRST
BRANCH FROM THEIR RESPECTIVE POSITIONS

317: HAVE ALL
PARTITIONED SUBDIVISIONS
BEEN ADDRESSED? (NO / YES)

318: END ENCODING
ALGORITHM

FIG. 5A
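The encoding algorithm of FIG. 5A can be sketched as follows. The choice between step 314 and steps 315 and 316 is not legible in the figure, so the sketch assumes that a single user bit decides it (0 for step 314, 1 for steps 315 and 316). It also assumes contiguous subdivisions and a most-significant-bit-first reading of the (m - p) selection bits. The function name is hypothetical, and paths are written with '0' for a first branch and '1' for a second branch.

```python
def select_subconstellations(bits, m, d):
    """Consume leading user bits; return (tree path per codeword, bits consumed)."""
    P = [i for i in range(m) if (d >> i) & 1]
    paths = [""] * (2 ** m)                # every codeword starts at the root node
    pos = 0
    for p in P:                            # step 312: ascending order of p
        size = 2 ** (m - p)
        for k in range(2 ** p):            # each subdivision in turn
            chosen = None
            flag = bits[pos]               # ASSUMED decision bit (not legible in figure)
            pos += 1
            if flag == 1:
                index = 0
                for b in bits[pos:pos + (m - p)]:   # step 315: next (m - p) bits
                    index = 2 * index + b
                pos += m - p
                chosen = k * size + index
            for c in range(k * size, (k + 1) * size):
                # Steps 314 and 316: everyone else takes the first branch.
                paths[c] += "1" if c == chosen else "0"
    return paths, pos


print(select_subconstellations([1, 1, 0, 0, 1, 1], 2, 3))
```

Under this assumption the bit count balances: a codeword that takes a second branch of weight (m - p) lands in a subconstellation that is 2^(m - p) times smaller, so the (m - p) selection bits replace address bits, and the d decision bits account for the d extra user bits in (2^m n + d).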
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
400: READ THE 2^m DC-FREE CODEWORDS
FROM THE MAGNETIC RECORDING MEDIUM

404: RETRIEVE THE q-BIT ADDRESS
ASSOCIATED WITH EACH
OF THE 2^m CODEWORDS

406: ARRANGE THE RETRIEVED ADDRESS
BITS IN THE ORDER IN WHICH THE
CORRESPONDING CODEWORDS
WERE RECEIVED

408: RETRIEVE IDENTITY OF
SUBCONSTELLATIONS TO WHICH
EACH OF THE CODEWORDS BELONGS

410: IS THE ORDER AND
COMBINATION OF SUBCONSTELLATIONS
FROM WHICH THE RECEIVED CODEWORDS
WERE SELECTED PERMISSIBLE?

NO: 415: PROVIDE AN ERROR DETECTION SIGNAL

YES: 420: RECOVER REMAINING SEQUENCE OF USER
BITS BASED UPON COMBINATION AND ORDER
OF SUBCONSTELLATIONS FROM WHICH THE
RECEIVED CODEWORDS WERE SELECTED

422: APPEND THE REMAINING SEQUENCE
OF USER BITS TO THE BEGINNING
OF THE ARRANGEMENT OF BITS
FROM STEP 406

425: RECOVER THE DIGITAL INFORMATION
CORRESPONDING TO THE USER BITS
USING REED-SOLOMON DECODING
AND LEMPEL-ZIV DECOMPRESSION

FIG. 7
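The permissibility check and the recovery of the remaining user bits from the subconstellation identities can be sketched in Python. This is a hedged illustration, not the decoder of the invention: it assumes that each subconstellation is identified by its tree path ('0' for a first branch, '1' for a second branch), that subdivisions are contiguous, that a combination is impermissible when more than one codeword in a subdivision took the second branch, and that a leading decision bit precedes each group of (m - p) selection bits. The function name is hypothetical.

```python
def recover_path_bits(paths, m, d):
    """Rebuild the user bits that selected the subconstellations, or raise on error."""
    P = [i for i in range(m) if (d >> i) & 1]
    bits = []
    for level, p in enumerate(P):
        size = 2 ** (m - p)
        for k in range(2 ** p):            # each subdivision at this level
            hits = [c for c in range(k * size, (k + 1) * size)
                    if paths[c][level] == "1"]
            if len(hits) > 1:
                # Assumed meaning of an impermissible combination.
                raise ValueError("impermissible combination of subconstellations")
            if not hits:
                bits.append(0)             # assumed decision bit: no second branch
            else:
                index = hits[0] - k * size
                bits.append(1)
                # (m - p) selection bits, most significant bit first (assumed).
                bits.extend((index >> s) & 1 for s in range(m - p - 1, -1, -1))
    return bits


print(recover_path_bits(["00", "00", "10", "01"], 2, 3))
```

The recovered bits would then be placed in front of the arranged address bits. A raised `ValueError` plays the role of the error detection signal in this sketch.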
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